%%
%% Copyright 2007, 2008, 2009 Elsevier Ltd
%%
%% This file is part of the 'Elsarticle Bundle'.
%% ---------------------------------------------
%%
%% It may be distributed under the conditions of the LaTeX Project Public
%% License, either version 1.2 of this license or (at your option) any
%% later version.  The latest version of this license is in
%%    http://www.latex-project.org/lppl.txt
%% and version 1.2 or later is part of all distributions of LaTeX
%% version 1999/12/01 or later.
%%
%% The list of all files belonging to the 'Elsarticle Bundle' is
%% given in the file `manifest.txt'.
%%

%% Template article for Elsevier's document class `elsarticle'
%% with numbered style bibliographic references
%% SP 2008/03/01

%% \documentclass[preprint,12pt]{elsarticle}

%% Use the option review to obtain double line spacing
%% \documentclass[authoryear,preprint,review,12pt]{elsarticle}

%% Use the options 1p,twocolumn; 3p; 3p,twocolumn; 5p; or 5p,twocolumn
%% for a journal layout:
\documentclass[final,1p,times]{elsarticle}
%% \documentclass[final,1p,times,twocolumn]{elsarticle}
%% \documentclass[final,3p,times]{elsarticle}
%% \documentclass[final,3p,times,twocolumn]{elsarticle}
%% \documentclass[final,5p,times]{elsarticle}
%% \documentclass[final,5p,times,twocolumn]{elsarticle}

%% For including figures, graphicx.sty has been loaded in
%% elsarticle.cls. If you prefer to use the old commands
%% please give \usepackage{epsfig}

%% The amssymb package provides various useful mathematical symbols
\usepackage{amssymb}
%% The amsthm package provides extended theorem environments
%% \usepackage{amsthm}

%% The lineno package adds line numbers. Start line numbering with
%% \begin{linenumbers}, end it with \end{linenumbers}. Or switch it on
%% for the whole article with \linenumbers.
%% \usepackage{lineno}
\usepackage{psfrag}
\usepackage{textcomp}
\usepackage{epstopdf}
 \usepackage{subfigmat}

\journal{Journal of Computational Physics}

\begin{document}

\begin{frontmatter}

%% Title, authors and addresses

%% use the tnoteref command within \title for footnotes;
%% use the tnotetext command for the associated footnote;
%% use the fnref command within \author or \address for footnotes;
%% use the fntext command for the associated footnote;
%% use the corref command within \author for corresponding author footnotes;
%% use the cortext command for the associated footnote;
%% use the ead command for the email address,
%% and the form \ead[url] for the home page:
%% \title{Title\tnoteref{label1}}
%% \tnotetext[label1]{}
%% \author{Name\corref{cor1}\fnref{label2}}
%% \ead{email address}
%% \ead[url]{home page}
%% \fntext[label2]{}
%% \cortext[cor1]{}
%% \address{Address\fnref{label3}}
%% \fntext[label3]{}

\title{Optimized prefactored compact schemes for wave propagation phenomena}

%% use optional labels to link authors explicitly to addresses:
%% \author[label1,label2]{}
%% \address[label1]{}
%% \address[label2]{}

\author[label1]{A. Rona\corref{cor1}}
\ead{ar45@leicester.ac.uk}
\ead[url]{http://www.le.ac.uk/eg/ar45/}
\author[label2]{I. Spisso}
\ead{i.spisso@cineca.it}
\author[label1]{E. Hall}
\ead{eh171@leicester.ac.uk}
\author[label3]{S. Pirozzoli}
\ead{sergio.pirozzoli@uniroma1.it}
\author[label3]{M. Bernardini}
\ead{matteo.bernardini@uniroma1.it}
\cortext[cor1]{Corresponding author. Tel.: +44 (0)116 2522510; fax: +44 (0)116 2522619.}
\address[label1]{Department of Engineering, University of Leicester, Leicester, LE1 7RH, England}
\address[label2]{Cineca, via Magnanelli 6/3, 40033 Casalecchio di Reno, Italy}
\address[label3]{Universit\`a degli Studi di Roma ``La Sapienza'', Dipartimento di Meccanica e Aeronautica, Via Eudossiana 18, 00184 Roma, Italy}

\begin{abstract}
%% Text of abstract
A new prefactored cost-optimized scheme is developed to minimize the computational cost for a given level of error in modelling wave propagation phenomena, such as aerodynamic sound propagation. This work extends the theory of Bernardini and Pirozzoli~\cite{pirozzoli-2007-JCP} to the prefactored compact high-order schemes of Hixon~\cite{hixon-00-JCP}.

Theoretical predictions for the spatial and temporal error bounds are determined and compared against benchmark classical schemes. The performance of popular schemes for computational aeroacoustics (CAA) applications and of the cost-optimized schemes is compared in terms of computational efficiency.

High-order boundary closures, which are accurate and stable within a given Fourier space envelope, are coupled with the interior prefactored schemes. The stability of the prefactored cost-optimized schemes coupled with these boundary closures is verified by an eigenvalue analysis.

A monochromatic sinusoidal test-case has verified the theoretical roll-off error against the computed $\overline{L}_{2}$ norm error, indicating that the cost-optimized schemes perform according to the design high-order accuracy characteristics for this class of problems.

Numerical experiments have verified that the design cost-optimization of the schemes is achieved. A $22 \%$ computational cost-saving at the design level of error is recorded. The percentage cost-saving is envisaged to be higher for a level of error one decade lower than the design level of error, and higher still in a multi-dimensional space.
\end{abstract}

\begin{keyword}
%% keywords here, in the form: keyword \sep keyword
Computational aeroacoustics \sep
Prefactored compact schemes \sep
Optimized algorithms \sep
Numerical dispersion \sep
Numerical dissipation
%% PACS codes here, in the form: \PACS code \sep code

%% MSC codes here, in the form: \MSC code \sep code
%% or \MSC[2008] code \sep code (2000 is the default)

\end{keyword}

\end{frontmatter}

%% \linenumbers

%% main text
\section{Introduction}\label{sec:introduction}
\subsection{Challenges in modelling wave generation and propagation phenomena}
Models for the propagation of waves in a continuum are developed across the full spectrum of physical sciences, including aeroacoustics, where increasingly stringent aircraft noise regulations~\cite{whitepaper-2003,acare-08-addendum} promote the development of accurate and affordable methods for predicting aerodynamically generated noise. Enhanced Computational Aeroacoustic (CAA) schemes are sought that can model sound generation and its propagation as part of the industrial design process, where predictions of accuracy compatible with the design specifications are required at an affordable cost, produced within a specific time-frame, using multi-processor computer clusters.

Computational aeroacoustic schemes that model sound generation and propagation in a single simulation typically face a number of challenges. The direct approach to modelling trailing edge noise is a representative example. Flow structures with a length scale of the order of the boundary layer thickness cross the trailing edge while exhibiting large fluctuations in momentum, of the order of the free-stream dynamic pressure. The interaction generates small amplitude acoustic pressure perturbations, of long wavelength compared to the boundary layer thickness. These radiate at the speed of sound, which is much larger than the mean convection speed of the flow structures.
% TODO (authors): citation to be added.

The trailing edge noise example illustrates the separation of length scales in aeroacoustic problems and the attendant requirements for low numerical dispersion and dissipation.
\subsection{Prefactored compact finite-difference schemes}
Different aeroacoustic problems exhibit different flow physics. Therefore, no single algorithm is available to model all problems with adequate resolution and accuracy. Low Mach number acoustically active flows typically involve capturing complex features that are nevertheless computationally smooth, that is, these features do not involve a sharp change in the flow state, such as from a shock. In these circumstances, it is computationally advantageous to use higher than second-order (high-order) schemes, for which convergence is pursued by increasing the order of the scheme as opposed to refining the spatial mesh.
% TODO (Manolis/Ed Hall): add the asymptotic convergence rates here (log N for increasing the order? 1/N for mesh refinement?).
Several numerical methods for aeroacoustics have emerged in the last two decades~\cite{deroeck-04-ISMA,kurbatskii-04-IJCFD,colonius-04-PAS} with attendant applications documented in four proceedings of the Computational Aeroacoustic (CAA) workshops on benchmark problems~\cite{hardin-95-NASACP,hardin-96-NASACP,hardin-00-NASACP}.
Lele~\cite{lele-92-JCP} pioneered the use of Pad\'e-type compact and explicit optimized schemes in aeroacoustics. This work highlighted the requirement for special near-boundary treatment, driven by the longer finite-difference stencils used in these high-order methods.
Hixon~\cite{hixon-00-JCP} has introduced a prefactorization method to reduce the non-dissipative central-difference stencil of the compact schemes to two lower-order biased stencils which have easily solved reduced matrices. The advantages of these schemes over traditional compact schemes arise from their reduced stencil size and the independent nature of the resultant factored matrices. By reducing the stencil size of the compact schemes, the prefactorization method reduces the required number of boundary stencils, thereby simplifying boundary specification~\cite{hixon-96-NASACR}. Ashcroft and Zhang~\cite{ashcroft-03-JCP} have extended the factorization concept to a broader class of compact schemes using a more general derivation strategy, which combines Fourier analysis with the notion of a numerical wavenumber. The resulting class of optimized prefactored schemes exhibits better wave propagation characteristics than the standard prefactored compact ones.
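As an illustrative sketch of the prefactorization concept, and not a restatement of the specific stencils of~\cite{hixon-00-JCP}, let $D_{i}^{F}$ and $D_{i}^{B}$ denote the forward and backward biased derivative estimates at node $i$ (symbols introduced here for illustration only). The central estimate $D_{i}$ is then recovered as their average,
\[
D_{i} = \frac{1}{2}\left( D_{i}^{F} + D_{i}^{B} \right).
\]
If the numerical wavenumbers $\bar{\kappa}^{F}$ and $\bar{\kappa}^{B}$ of the two biased operators satisfy
\[
\Re\left(\bar{\kappa}^{F}\right) = \Re\left(\bar{\kappa}^{B}\right), \qquad
\Im\left(\bar{\kappa}^{F}\right) = -\Im\left(\bar{\kappa}^{B}\right),
\]
then the averaged operator has the numerical wavenumber
\[
\bar{\kappa} = \frac{1}{2}\left( \bar{\kappa}^{F} + \bar{\kappa}^{B} \right) = \Re\left(\bar{\kappa}^{F}\right),
\]
which is real. Under these assumptions, the dissipative contributions of the two biased stencils cancel and the combined spatial operator is non-dissipative, like the central compact scheme from which it is derived.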
\subsection{Optimization of the numerical methods}
Several finite-difference explicit and compact methods are now available for solving wave propagation problems in a low Mach number flow. It is typical to involve a high performance computer cluster for such a computation, where the cost of the computation is important. It is therefore of interest to develop and implement an optimization strategy on the computational cost, based on an acceptable level of numerical accuracy of the results.
The issue of computational efficiency of finite-difference schemes has been investigated in detail by Colonius and Lele~\cite{colonius-04-PAS} and by Spisso and Rona~\cite{rona-spisso-07-ICSV14}. These authors have considered the behaviour of several types of spatial discretizations, implicitly assuming exact time integration. The error associated with approximate time integration is usually considered separately from the spatial error. Pirozzoli~\cite{pirozzoli-2007-JCP} has developed a general strategy for the analysis of finite-difference schemes for wave propagation problems that involves time integration in the analysis in a natural way. The analysis of the global discretization error has shown the occurrence of two approximately independent sources of error, associated with the space and time discretizations. The performance of the global scheme can therefore be improved by minimizing the two contributions separately. The analysis leads to rational and simple criteria for deriving optimized space- and time-discretization schemes, based on the concepts of spatial and temporal resolving efficiency. A careful design of the space- and time-discretization schemes, as well as an appropriate choice of the grid spacing and of the time step, can yield substantial computer time savings.
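A minimal sketch of how a cost measure of this kind can arise is given here for a one-dimensional problem, under the assumption that the computational cost scales with the number of grid points times the number of time steps. The symbols $k$ (wavenumber), $c$ (propagation speed), $\Delta t$ (time step), $T$ (simulated time), $N_{x}$, and $N_{t}$ are introduced for illustration only, with $h$ the grid spacing and $L$ the domain length. Defining the reduced wavenumber $\kappa = k h$ and the Courant number $\sigma = c \Delta t / h$, the number of grid points and the number of time steps are
\[
N_{x} = \frac{L}{h} = \frac{k L}{\kappa}, \qquad
N_{t} = \frac{T}{\Delta t} = \frac{c T}{\sigma h} = \frac{k c T}{\sigma \kappa},
\]
so that
\[
N_{x} N_{t} = \left( k L \right) \left( k c T \right) \frac{1}{\sigma \kappa^{2}}.
\]
For a given physical problem, $k L$ and $k c T$ are fixed, so the cost normalized by these factors is $1/(\sigma \kappa^{2})$, consistent with the normalized one-dimensional cost function plotted in Fig.~\ref{fig:local_C12rk4}. The cost falls as either $\kappa$ or $\sigma$ increases, whereas the discretization error typically grows, which motivates seeking the pair $(\kappa, \sigma)$ of least cost at a prescribed level of error.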
\subsection{Obtaining and testing optimized pre-factored compact schemes}
The aim of the present work is to extend the computational cost optimization method of Pirozzoli~\cite{pirozzoli-2007-JCP} to prefactored compact schemes, assess by numerical experiment the actual computational cost of the newly developed optimized schemes, and verify that the level of accuracy of the solution is retained. In attaining this aim, several challenges were overcome. For the spatial differentiation, compact stencils optimized for set levels of error required an analytical pre-factorization that retained the non-dissipative characteristics of the equivalent compact centred scheme and satisfied additional symmetry relationships of pre-factored functions. A new set of near-boundary stencils was developed to provide pre-factored spatial derivative estimates on approach to the computational boundaries. For this, analytical statements of compatibility between the pre-factored boundary and interior derivatives, in a spectral sense, were formulated. The stability map of the pre-factored scheme coupled with pre-factored computational domain closures was determined and the presence of slow growing modes affecting the stability of the computation was identified. Finally, the coupling of the optimized spatial discretization with a separately optimized temporal integration by Bernardini and Pirozzoli~\cite{bernardini-pirozzoli-09-JCP} for Runge-Kutta schemes was studied analytically, before obtaining a space and time cost-optimized time-marching scheme.
The numerical tests for verifying the level of accuracy and computational cost of the optimized schemes posed a number of challenges. Each accuracy set-point used for optimizing the scheme produced one set of coefficients for the discrete pre-factored spatial differentiation and temporal integration. Each set required testing over a discrete Courant and wavenumber space of $8030$ points to obtain numerically iso-cost and iso-accuracy maps to compare with the a priori analytical predictions. A computational platform was required that enabled the concurrent running of these multiple tests and that used individually allocated (sequestrated) processor and memory resources, so that reliable computational cost data could be collected. These numerical challenges were met by implementing dedicated job scheduler software and by administrator-level configuration settings on a High Performance Computer cluster at the Italian National Research Center facility Cineca.
\subsection{Paper outline}
The paper is organized as follows: in Section~\ref{sec:optimization_of_finite_difference_schemes}, the theory of the cost-optimization of Pirozzoli~\cite{pirozzoli-2007-JCP}, based on the optimization of the computational cost for a given error level, is reviewed and the approximate decoupling of the spatial and temporal errors is introduced. The decoupled errors are used in Section~\ref{sec:scheme development} to obtain a new family of prefactored cost-optimized schemes applicable to different accuracy levels of numerical simulations. In that section, the algebraic symmetry properties of the prefactored stencils from Hixon~\cite{hixon-00-JCP} are used to define a new spatial differentiation for cost-optimized compact schemes that is prefactored. One-sided stencils are developed for application near the computational domain boundaries and the spatial differentiation is coupled with a Runge-Kutta time integration that is cost-optimized to a matching level of accuracy by the cost-optimization process of Bernardini and Pirozzoli~\cite{bernardini-pirozzoli-09-JCP}. Section~\ref{sec:scheme analysis} presents an eigenvalue based analysis of the time-marching scheme, inclusive of the effect of the computational boundary closures. The theoretical stability limits of the numerical scheme are determined in wavenumber and Courant number space. Numerical tests are reported in Section~\ref{sec:numerical_tests} that compare the actual accuracy of the computational schemes with the design value, calibrate the analytic cost function with the recorded time for the actual computations, and investigate the difference between the optimal set-point operation identified in the numerical tests and the design one. In Section~\ref{sec:n-space_applications}, the changes in the cost-optimization procedure required to extend it to three-dimensional wave propagation problems are identified and the potential for gaining computational savings in multi-dimensional problems is illustrated by the use of a modified analytic cost model.
Conclusions from the current work and future perspectives are presented in Section~\ref{sec:conclusion}.

\section{Optimization of finite-difference schemes for wave propagation phenomena}
\label{sec:optimization_of_finite_difference_schemes}

\begin{figure}
	  \begin{center}
	  \hspace*{0mm} $x_{1}$ \hspace{4.5mm} $x_{2}$ \hspace{11.0mm} $x_{i-2}$ \hspace{3.5mm} $x_{i-1}$  \hspace{5.0mm} $x_{i}$  \hspace{5.5mm} $x_{i+1}$ \hspace{4.0mm} $x_{i+2}$ \hspace{6.0mm} $x_{N-1}$ \hspace{3.0mm} $x_{N}$\\
	  \hspace*{0mm} \textopenbullet--------\textopenbullet
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textopenbullet--------\textopenbullet--------\textbullet--------\textopenbullet--------\textopenbullet
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textopenbullet--------\textopenbullet\\
	  \hspace*{12mm} $\leftarrow$ h $\rightarrow$ \\
	  \hspace*{0mm} $\leftarrow---------------L--------------\rightarrow$ \\
	  \hspace*{0mm} $f_{1}$ \hspace{5.0mm} $f_{2}$ \hspace{11.0mm} $f_{i-2}$
\hspace{3.5mm} $f_{i-1}$  \hspace{5.0mm} $f_{i}$  \hspace{5.5mm} $f_{i+1}$
\hspace{4.0mm} $f_{i+2}$ \hspace{6.5mm} $f_{N-1}$ \hspace{4.0mm} $f_{N}$\\
	  \hspace*{0mm} \textopenbullet--------\textopenbullet
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textopenbullet--------\textopenbullet--------\textbullet--------\textopenbullet--------\textopenbullet
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered\textperiodcentered
\textperiodcentered\textopenbullet--------\textopenbullet\\
	  \end{center}
      \caption{Variation of discrete function $f_i = f(x_i)$ along uniformly discretised streamwise length $L$.}
 \label{fig:mesh nodes}
\end{figure}

\begin{figure}
 \centering
\psfrag{x}[][2.5]{$\kappa$}
\psfrag{y}[][2.5]{$\sigma$}
\includegraphics[width=0.7\textwidth,keepaspectratio=true]{fig/local_C12rk4.eps}
\caption[Iso-contours of normalized `local' error function and normalized one-dimensional cost function for $C1122/RK4$ scheme.]{Iso-contours of normalized `local' error function $e (\kappa, \sigma)$
(solid lines) and normalized one-dimensional cost function $1/(\sigma
\kappa^{2})$ (dashed lines), for $C1122/RK4$ scheme.}
 \label{fig:local_C12rk4}
\end{figure}

\begin{figure}
 \centering
\psfrag{x}[][2.5]{$\check{\kappa}$}% should \kappa be included as well?
\psfrag{y}[][2.5]{$\check{\sigma}$}
\includegraphics[width=0.7\textwidth,keepaspectratio=true]{fig/local_vs_global_r4c12.eps}
\caption[Iso-contours of normalized `local' and `global' error functions for $C1122/RK4$ scheme.]{Iso-contours of normalized `local' error function $e(\kappa, \sigma)$ (dashed lines) and `global' error function $\check{e} (\check{\kappa}, \check{\sigma})$ (dotted lines), for $C1122/RK4$ scheme.}
 \label{fig:local_vs_global_C12rk4}
\end{figure}

\begin{figure}
 \psfrag{x}[][2.5]{$\kappa$}
 \psfrag{y}[][2.5]{$\sigma$}
\includegraphics[width=13.0cm]{fig/approx_c12rk4.eps}
\caption[Iso-contours of the normalized approximate `local' error function for $C1122/RK4$ scheme.]{Iso-contours of normalized `local' error function $e(\kappa,\sigma)$ (black dotted lines); long-dashed dark and light blue lines represent the corresponding approximation given respectively by eq.~\ref{eq:error space Lele} and~\ref{eq:error time Hu} for $C1122/RK4$ scheme; filled black dots indicate the `optimal' working condition.}
\label{fig:approx_local_c12rk4}
\end{figure}

\clearpage
\section{Extension to time-marching prefactored compact schemes}
\label{sec:scheme development}

\subsection{Spatial differentiation by prefactored compact finite-difference schemes}
\label{ssec:prefactored compact schemes}

\subsection{Optimized prefactored compact schemes}
\label{ssec:optimized prefactored compact schemes}

\begin{figure}% the data are in the file /home/ivan/Documents/PhD_thesis/PhD_thesis/matlabfile/c12epsm.m
\centering
\psfrag{b}[B1][B1][1.][0]{$\kappa/\pi$}
\psfrag{y}[B1][B1][1.][0]{$Re\left(\bar{\kappa}(\kappa)/\pi\right)$}
\psfrag{w}[B1][B1][1.][0]{$\kappa$}
\psfrag{x}[B1][B1][1.][0]{$Im\left(\bar{\kappa}^{F,B}(\kappa)\right)$}
\psfrag{z}[B1][B1][1.][0]{$Im\left(\bar{\kappa}^{F}(\kappa)\right)$}
\psfrag{n}[B1][B1][1.][0]{$Im\left(\bar{\kappa}^{B}(\kappa)\right)$}
  \begin{subfigmatrix}{4}
\subfigure[\label{fig:pref_c12epsma}]{\includegraphics[width=0.49\textwidth]{fig/pref_c12epsm.eps}}
\subfigure[\label{fig:pref_c12epsmb}]{\includegraphics[width=0.49\textwidth]{fig/pref_c12Imepsm.eps}}
\subfigure[\label{fig:pref_c12epsmc}]{\includegraphics[width=0.49\textwidth]{fig/pref_c12ImFwdepsm.eps}}
\subfigure[\label{fig:pref_c12epsmd}]{\includegraphics[width=0.49\textwidth]{fig/pref_c12ImBwdepsm.eps}}
\end{subfigmatrix}
\caption[Dispersive characteristics of the prefactored classical $C1122$ and cost-optimized  $C12$\textsf{epsm}$5$, $C12$\textsf{epsm}$4$,
$C12$\textsf{epsm}$3$ schemes.]{Dispersive characteristics of the prefactored classical $C1122$ and cost-optimized  $C12$\textsf{epsm}$5$, $C12$\textsf{epsm}$4$,
$C12$\textsf{epsm}$3$ schemes. (a) Real component of the prefactored forward stencil from eq.~\ref{eq:forward real scaled pseudo-wavenumber}. (b) Imaginary components of the prefactored forward and backward stencil, respectively from eq.~\ref{eq:forward Im modified wavenumber} and~\ref{eq:backward Im scaled pseudo-wavenumber}. (c) Positive imaginary portion from (b). (d) Negative imaginary portion from (b).}
\label{fig:pref_c12epsm}
\end{figure}

\clearpage
\subsection{Spatial differentiation near computational boundaries}
\label{ssec:perimetrial_scheme}

\begin{figure}%the data are in the file /home/ivan/Documents/PhD_thesis/PhD_thesis/coeff_derivation/matlab file/syms_ex4.m
\centering
\psfrag{w}[B1][B1][1.][0]{$\kappa/\pi$}
\psfrag{d}[B1][B1][1.][0]{$\Re\left(\widetilde{f_{1}^{\prime F,B}}\right) /\pi$}
\psfrag{l}[B1][B1][1.][0]{$N_{\lambda}$}
\psfrag{b}[B1][B1][1.][0]{$\Im\left(\widetilde{f_{1}^{\prime F}}\right) /\pi$}
\psfrag{q}[B1][B1][1.][0]{$\varepsilon_{R}(\kappa)$}
\psfrag{f}[B1][B1][1.][0]{$\varepsilon_{I}(\kappa)$}
  \begin{subfigmatrix}{4}
\subfigure[\label{fig:intpref_disp}]{\includegraphics[width=0.49\textwidth]{fig/intpref_disp.eps}}
\subfigure[\label{fig:intpref_dissip}]{\includegraphics[width=0.49\textwidth]{fig/intpref_dissip.eps}}
\subfigure[\label{fig:intpref_disperr}]{\includegraphics[width=0.49\textwidth]{fig/intpref_err_disp.eps}}
\subfigure[\label{fig:intpref_disserr}]{\includegraphics[width=0.49\textwidth]{fig/intpref_err_diss.eps}}
\end{subfigmatrix}
\caption[Dispersive characteristics of the forward prefactored interior 11-point stencils for the $i$-th node.]{Dispersive characteristics of the forward prefactored interior 11-point stencils for the $i$-th node. (a) Real and (b) imaginary components of the Fourier
transform. (c) Dispersive error from eq.~\ref{eq:dispersive error def}. (d) Dissipative error from eq.~\ref{eq:dissipative error def}.}
\label{fig:pref_bc_interior}
\end{figure}

\clearpage
\subsection{Cost-optimized Runge-Kutta time integration}
\label{ssec:cost-optimized temporal solver}

\begin{figure}
\centering
 \psfrag{x}[B1][B1][1.0][0]{$\tilde{\epsilon}$}
 \psfrag{z}[B1][B1][1.0][0]{$\check{z}^{*}(\tilde{\epsilon})$}
 \psfrag{y}[B1][B1][1.0][0]{$\gamma_{3}^{*}$}
 \psfrag{w}[B1][B1][1.0][0]{$\gamma_{4}^{*}$}
\includegraphics[width=0.7\textwidth,keepaspectratio=true]{fig/zRK4opt.eps}
\caption[Optimal values of temporal resolving efficiencies $\check{z}^{*}(\tilde{\epsilon})$, $\gamma_{3}$ and $\gamma_{4}$ for the second-order, four-stage optimized RK time integration scheme.]{Optimal values of temporal resolving efficiencies $\check{z}^{*}(\tilde{\epsilon})$ (black solid line with diamond symbols), $\gamma_{3}$ (black dash-dotted line with diamond symbols) and $\gamma_{4}$ (black dashed line with diamond symbols) for the second-order, four-stage optimized RK time integration scheme. The black lines and arrows without symbols indicate the corresponding coefficients for the classical RK4 scheme.}
\label{fig:zRK4opt}
\end{figure}

\clearpage
\section{Scheme analysis}
\label{sec:scheme analysis}

\begin{figure}%the layout is local2DC12RK4vsEpsm4.lay
\centering
 \psfrag{x}[B1][B1][1.4][0]{$\kappa$}
 \psfrag{y}[B1][B1][1.4][0]{$\sigma$}
 %cd $\psfrag{w}[B1][B1][1.2][0]{$\check{\kappa}$}
 %\psfrag{y}[B1][B1][1.2][0]{$\check{\sigma}$}
  \begin{subfigmatrix}{2}
\subfigure[\label{fig:local2DC12RK4vsEpsm4}]{\includegraphics[width=0.7\textwidth,keepaspectratio=true]{fig/local2DC12RK4vsEpsm4_new.eps}}
% the layout is local2DC12RK4epsm
\subfigure[\label{fig:locall2DC12RK4vsEpsm4Zoom}]{\includegraphics[width=0.7\textwidth,keepaspectratio=true]{fig/local2DC12RK4epsm4Zoom_new.eps}}
  \end{subfigmatrix}
  \caption[Contours of normalized `local' error function for the cost-optimized $\textsf{epsm4}$ scheme and for the corresponding non-optimal baseline $C1122/RK4$ scheme.]{(a) Contours of normalized `local' error function $e(\kappa, \sigma)$ for the cost-optimized $\textsf{epsm4}$ scheme (solid line) and for the corresponding non-optimal baseline $C1122/RK4$ scheme (long-dashed line). Constant ratio $e(\kappa, \sigma)$ contour spacing of 0.7037 between $10^{-8}$ and $0.3$. The filled circles and the diamonds represent the corresponding `optimal' working conditions of the respective schemes for the two-dimensional cost function of eq.~\ref{eq:optimal normalized cost}. (b) Enlarged view of the region near the design level of error $\tilde{\epsilon}=10^{-4}$ for the cost-optimized $\textsf{epsm4}$ scheme.}
  \label{fig:2DC12RK4vsEpsm4}%   Constant ratio $e(\kappa, \sigma)$ contour spacing  of 0.7037 between $10^{-8}$ and $0.3$. %No, there are 4 more level in the optimal design region.
\end{figure}

\begin{figure}% the layout is optimal_error_local2DC12RK4Epsm.lay
\centering
 \psfrag{x}[B1][B1][1.2][0]{$c_{2}^{*}$}
 \psfrag{z}[B1][B1][1.2][0]{$\tilde{\epsilon}$}
 \psfrag{w}[B1][B1][1.2][0]{$\kappa^{*}$}
 \psfrag{y}[B1][B1][1.2][0]{$\sigma^{*}$}
  \begin{subfigmatrix}{3}
\subfigure[\label{fig:optimal_error_local2DRK4epsm}]{\includegraphics[width=0.49\textwidth]{fig/optimal_error_local2DC12RK4Epsm_new.eps}}
\subfigure[\label{fig:optimal_wavenumber_local2DRK4epsm}]{\includegraphics[width=0.49\textwidth]{fig/optimal_wavenumber_local2DRK4Epsm_new.eps}}
\subfigure[\label{fig:optimal_courant_local2DRK4epsm}]{\includegraphics[width=0.7\textwidth]{fig/optimal_courant_local2DRK4Epsm_new.eps}}
\end{subfigmatrix}
\caption[Optimal `local' error, reduced wavenumber and Courant number as a function of the two-dimensional cost for the baseline $C1122/RK4$ scheme and the cost-optimized $\textsf{epsm5}$, $\textsf{epsm4}$, $\textsf{epsm3}$ schemes.]{(a) Optimal `local' error, (b) reduced wavenumber and (c) Courant number as a function of the two-dimensional cost for the baseline $C1122/RK4$
scheme and the cost-optimized $\textsf{epsm5}$, $\textsf{epsm4}$, $\textsf{epsm3}$ schemes. Line patterns as in Fig.~\ref{fig:stabilityRKopt}.}
\label{fig:optimal_local2DRK4Epsm}
\end{figure}

\clearpage
\subsection{Numerical stability}
\label{ssec:numerical stability}

\begin{figure}% the data are in the file /home/ivan/Documents/PhD_thesis/PhD_thesis/coeff_derivation/matlab file/syms_ex4.m($\alpha_{1}=1/3$ in eq.~\ref{eq:sixth-order})
\centering
\psfrag{x}[B1][B1][1.][0]{$\Re\left( s^{*} \right) $}
\psfrag{y}[B1][B1][1.][0]{$\Im\left( s^{*} \right) $}
  \begin{subfigmatrix}{4}
\subfigure[\label{fig:c1122_3553}]{\includegraphics[width=0.49\textwidth]{fig/C1122_3553.eps}}
\subfigure[\label{fig:c1122epm5_3553}]{\includegraphics[width=0.49\textwidth]{fig/C1122epsm5_3553.eps}}
\subfigure[\label{fig:c1122epsm4_3553}]{\includegraphics[width=0.49\textwidth]{fig/C1122epsm4_3553.eps}}
\subfigure[\label{fig:c1122epsm3_3553}]{\includegraphics[width=0.49\textwidth]{fig/C1122epsm3_3553.eps}}
\end{subfigmatrix}
\caption[Eigenvalue spectrum for the classical $C1122$ scheme and the cost-optimized $C12$\textsf{epsm}$n$ schemes, with $n=5,4,3$.]{Eigenvalue spectrum for the classical $C1122$ scheme and the cost-optimized $C12$\textsf{epsm}$n$ schemes, with $n=5,4,3$. $\left( \triangle \right)$ $N=21$, $\left( \circ \right)$ $N=41$, $\left( \diamond \right)$ $N=81$, $\left( \triangledown \right)$ $N=201$, $\left( \square \right)$ $N=401$. (a) $C1122$~\cite{carpenter-gottlieb-93-JCP}. (b) $C1122$\textsf{epsm}$5$. (c) $C1122$\textsf{epsm}$4$. (d) $C1122$\textsf{epsm}$3$.}
\label{fig:eigenvalue_c1122}
\end{figure}

\clearpage
\section{Numerical tests}
\label{sec:numerical_tests}

\subsection{Verification of numerical error levels}
\label{ssec:verification_error_levels}
The numerical error levels are first verified across the full wavenumber and Courant number envelope. The numerical error levels of the optimized schemes are then verified at their design optimal set-points of operation.

\begin{figure}
 \centering
 \psfrag{y}[B1][B1][1.][0]{$\sigma$}
 \psfrag{x}[B1][B1][1.][0]{$\kappa$}
\includegraphics[width=0.7\textwidth,keepaspectratio=true]{fig/isomap_grid.eps}
  \caption{Numerical grid used for the computed iso-maps reported in Fig.~\ref{fig:1dwave_C1122RK4epsm}.}
  \label{fig:isomap grid}
\end{figure}%

\begin{figure}%[ht]% the layout is /home/ivan/Documents/PhD_thesis/trunk/data/multi_domain/teorVSnum_costoptimal1ds in.lay
\centering
 \psfrag{x}[B1][B1][1.2][0]{$\tilde{c}_{1}^{*}$}
 \psfrag{z}[B1][B1][1.2][0]{$\tilde{\epsilon}$}
 \psfrag{w}[B1][B1][1.2][0]{$\kappa^{*}$}
 \psfrag{y}[B1][B1][1.2][0]{$\sigma^{*}$}
 \psfrag{t}[B1][B1][1.2][0]{Meas. Time}
  \begin{subfigmatrix}{3}
\subfigure[\label{fig:teorVsnum_costoptimal1dsin}]{\includegraphics[width=0.49\textwidth]{fig/teorVsnum_costoptimal1dsin.eps}}
\subfigure[\label{fig:optimal_wavenumber_local1Dsin}]{\includegraphics[width=0.49\textwidth]{fig/teorVsnum_wavenumberoptimal1dsin.eps}}
\subfigure[\label{fig:optimal_courant_local1Dsin}]{\includegraphics[width=0.49\textwidth]{fig/teorVsnum_sigmaoptimal1dsin.eps}}
\end{subfigmatrix}
\caption[Comparison of the theoretical and numerical optimal `local' error versus cost, reduced wavenumber and Courant number as
a function of the one-dimensional cost for the monochromatic sinusoidal wave.]{Comparison of the theoretical (lines) and numerical (symbols) optimal `local' error versus cost (a), reduced wavenumber (b) and Courant number (c) as
a function of the one-dimensional cost for the monochromatic sinusoidal wave. $\left(-, \diamond \right)$ $C1122$, $\left( - -, \triangledown \right)$
$C1122$\textsf{epsm}$5$, $\left( - \cdot -,  \circ \right)$ $C1122$\textsf{epsm}$4$, $\left( -\cdot \cdot -, \square  \right)$
$C1122$\textsf{epsm}$3$.}
\label{fig:teorVsnum_optimal1dsin}
\end{figure}

\subsection{Assessment of computational cost}
\label{ssec:assessment_computational_cost}
The measured computational cost is regressed on the analytical cost function. This verifies whether the analytical cost function is a good pseudo-variable for predicting the variation of the computational cost with wavenumber and Courant number in the numerical tests.
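One simple form that such a regression could take, stated here as an assumption since the regression model is not specified at this point, is a linear least-squares fit of the measured elapsed time against the normalized one-dimensional cost $\tilde{c}_{1}$,
\[
t_{m} \approx a \, \tilde{c}_{1} + b ,
\]
where $t_{m}$, $a$, and $b$ are hypothetical symbols introduced for illustration: $t_{m}$ is the measured elapsed time, $a$ is the elapsed time per unit of normalized cost, and $b$ is a fixed overhead that is independent of wavenumber and Courant number. Under this assumed model, a small residual over the tested wavenumber and Courant number range would indicate that the analytical cost function tracks the measured cost.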

\begin{figure}%[ht]
\centering
 \psfrag{x}[B1][B1][1.2][0]{$\tilde{c}_{1}^{*}$}
 \psfrag{z}[B1][B1][1.2][0]{$\tilde{\epsilon}$}
 \psfrag{w}[B1][B1][1.2][0]{$\kappa^{*}$}
 \psfrag{y}[B1][B1][1.2][0]{$\sigma^{*}$}
 \psfrag{m}[B1][B1][0.8][0]{$time$}
 \psfrag{n}[B1][B1][1.0][0]{$\overline{L}_{2}$}
  \begin{subfigmatrix}{3}
\subfigure[\label{fig:teorVsnum_costoptimal1dsinBis}]{\includegraphics[width=0.49\textwidth]{fig/teorVsnum_costoptimal1dsin.eps}}
\subfigure[\label{fig:elapsedtime_local1Dsin}]{\includegraphics[width=0.49\textwidth]{fig/elapsed_time.eps}}
\subfigure[\label{fig:elapsedtime_local1DsinZoom}]{\includegraphics[width=0.49\textwidth]{fig/elapsed_time_Zoom.eps}}
\end{subfigmatrix}
\caption[Theoretical and numerical optimal `local' error versus cost as a function of the one-dimensional cost function, and measured
elapsed time versus the normalized computed $\overline{L}_{2}$ norm error for the monochromatic sinusoidal wave.]{Theoretical (lines) and numerical (symbols) optimal `local' error versus cost as a function of the one-dimensional cost function (a), and measured
elapsed time versus the normalized computed $\overline{L}_{2}$ norm error (b) for the monochromatic sinusoidal wave. $\left(-, \diamond \right)$ $C1122$,
$\left( - -, \triangledown \right)$ $C1122$\textsf{epsm}$5$, $\left( - \cdot -,\circ \right)$ $C1122$\textsf{epsm}$4$, $\left( -\cdot \cdot -, \square
\right)$ $C1122$\textsf{epsm}$3$. (c) Zoom of the rectangular area reported in (b).}
\label{fig:costVsTime_optimal1dsin}
\end{figure}%

\begin{table}
\begin{footnotesize}
\begin{center}
\caption{Measured elapsed $time$ (in seconds) and computed normalized $\overline{L}_{2}$ norm error for the classical $C1122$ and the cost-optimized
schemes $\textsf{epsm}n$ at the computed optimal cost-error operational points. Final non-dimensional time $T=1$.}
  \label{tab:optimal mesured time}
 \begin{tabular}{|l|l|l|l|l|l|l|l|l|}
\hline
$\tilde{\epsilon}$
&\multicolumn{2}{c|}{$C1122/RK4$}&\multicolumn{2}{c|}{$\textsf{epsm}5$}
&\multicolumn{2}{c|}{$\textsf{epsm}4$}& \multicolumn{2}{c|}{$\textsf{epsm}3$}\\
\cline{2-9}
 &$time$& $\overline{L}_{2}$ & $time$ & $\overline{L}_{2}$ & $time$ &
$\overline{L}_{2}$ & $time$ & $\overline{L}_{2}$ \\
\hline
$10^{-8}$ & 4.4E-02 & 1.0330E-08 & 0.866   & 1.0754E-08 & 1.765   & 1.0731E-08 & 4.6     & 1.0889E-08 \\
$10^{-7}$ & 2.0E-02 & 9.7255E-08 & 0.163   & 9.1660E-08 & 0.297   & 1.0133E-07 & 0.6     & 1.0122E-07 \\
$10^{-6}$ & 1.2E-02 & 9.8504E-07 & 3.1E-02 & 1.0158E-06 & 6.0E-02 & 9.0841E-07 & 9.2E-02 & 1.0214E-06 \\
$10^{-5}$ & 6.0E-03 & 1.0024E-05 & 5.5E-03 & 1.0458E-05 & 1.4E-02 & 9.8967E-06 & 2.0E-02 & 1.0095E-05 \\
$10^{-4}$ & 4.0E-03 & 1.0253E-04 & 3.0E-03 & 9.8729E-05 & 3.7E-03 & 1.0143E-04 & 6.8E-03 & 1.0132E-04 \\
$10^{-3}$ & 2.6E-03 & 1.0019E-03 & 2.3E-03 & 9.5922E-04 & 2.3E-03 & 8.4095E-04 & 2.5E-03 & 1.0034E-03 \\
$10^{-2}$ & 2.0E-03 & 8.6665E-03 & 1.9E-03 & 6.9636E-03 & 1.7E-03 & 1.3702E-02 & 1.9E-03 & 1.0122E-02 \\
$10^{-1}$ & 1.5E-03 & 9.8487E-02 & 1.4E-03 & 9.5154E-02 & 1.4E-03 & 9.0776E-02 & 1.4E-03 & 1.0079E-01 \\
\hline
\end{tabular}
 \end{center}
\end{footnotesize}
\end{table}

 \begin{table}
  \begin{center}
\caption{Comparison of the percentage cost saving $\Delta \tilde{c}^{*}_{1}\%$ and the measured elapsed-time saving $\Delta t\%$ of the cost-optimized schemes with
respect to the classical baseline scheme, at the nominal optimal design levels of error $\tilde{\epsilon}$. $\Delta \tilde{c}^{*}_{1}\%$ is taken from
Tab.~\ref{tab:optimal teo vs num optimal points}, and $\Delta t\%$ from Tab.~\ref{tab:optimal mesured long time}.}
 \label{tab:optimal cost and time}
\begin{tabular}{|l|l|l|l|l|l|} \hline
$\tilde{\epsilon}$ &\multicolumn{5}{c|}{$\textsf{epsm}5$} \\ \cline{2-6}
 & $\Delta \tilde{c}^{*}_{1}\%$ & $\Delta t\%_{T=1}$ &  $\Delta t\%_{T=10}$ &
$\Delta t\%_{T=100}$ & $\Delta t\%_{T=500}$ \\ \hline
$10^{-5}$ & 22.3 & 11.29 & 21.42  & 23.45 & 23.66 \\ \hline
$\tilde{\epsilon}$ &\multicolumn{5}{c|}{$\textsf{epsm}4$} \\ \cline{2-6}
 & $\Delta \tilde{c}^{*}_{1}\%$ & $\Delta t\%_{T=1}$ &  $\Delta t\%_{T=10}$ &
$\Delta t\%_{T=100}$ & $\Delta t\%_{T=500}$ \\ \hline
$10^{-4}$ & 7.83 & 7.5 & 8.6 & 9.67 & 10.6 \\ \hline
$\tilde{\epsilon}$ &\multicolumn{5}{c|}{$\textsf{epsm}3$} \\ \cline{2-6}
 & $\Delta \tilde{c}^{*}_{1}\%$ & $\Delta t\%_{T=1}$ &  $\Delta t\%_{T=10}$ &
$\Delta t\%_{T=100}$ & $\Delta t\%_{T=500}$  \\ \hline
$10^{-3}$ & 11.24 & 4.23 & 10.52 & 12.79 & 16.41 \\ \hline
\end{tabular}
  \end{center}
 \end{table}
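
As a worked check of the cost-saving column, assume that the saving is defined relative to the cost of the baseline $C1122/RK4$ scheme (the subscripts `base' and `opt' are labels introduced here for illustration only):
\[
 \Delta \tilde{c}^{*}_{1}\% = 100\,
 \frac{\tilde{c}^{*}_{1,\mathrm{base}}-\tilde{c}^{*}_{1,\mathrm{opt}}}{\tilde{c}^{*}_{1,\mathrm{base}}} .
\]
With the bold numerical cost-optimal values of Tab.~\ref{tab:optimal teo vs num optimal points}, this gives, for \textsf{epsm}$5$ at $\tilde{\epsilon}=10^{-5}$,
\[
 100\,\frac{13.94-10.83}{13.94} \approx 22.3 ,
\]
and likewise $100\,(5.40-4.977)/5.40 \approx 7.83$ for \textsf{epsm}$4$ at $\tilde{\epsilon}=10^{-4}$ and $100\,(2.09-1.855)/2.09 \approx 11.24$ for \textsf{epsm}$3$ at $\tilde{\epsilon}=10^{-3}$, consistent with the tabulated values.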

\subsection{Verification of the optimal set-point operation}
\label{ssec:optimal_set_point_operation}
This section compares two combinations of Courant number and wavenumber: the combination at which the optimized schemes are predicted to operate at their minimum computational cost for a given error level, and the combination identified as optimal in the numerical tests.
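
The optimality condition behind this comparison can be sketched as follows, assuming that the optimal set-point minimizes the one-dimensional cost $c_{1}=1/(\sigma \kappa^{2})$ along the iso-error line $e(\kappa,\sigma)=\tilde{\epsilon}$. The Lagrange multiplier $\lambda$ is introduced here for illustration only. Stationarity of $c_{1}-\lambda\left(e-\tilde{\epsilon}\right)$ requires
\[
 \frac{\partial c_{1}}{\partial \kappa} = -\frac{2}{\sigma \kappa^{3}} = \lambda\,\frac{\partial e}{\partial \kappa} ,
 \qquad
 \frac{\partial c_{1}}{\partial \sigma} = -\frac{1}{\sigma^{2} \kappa^{2}} = \lambda\,\frac{\partial e}{\partial \sigma} .
\]
Eliminating $\lambda$ gives
\[
 \kappa\,\frac{\partial e}{\partial \kappa} = 2\,\sigma\,\frac{\partial e}{\partial \sigma} ,
\]
that is, under this assumption the iso-error line and the iso-cost line are tangent at the optimal $(\kappa^{*},\sigma^{*})$.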

\begin{figure}
\centering
 \psfrag{x}[B1][B1][1.2][0]{$\kappa$}
 \psfrag{y}[B1][B1][1.2][0]{$\sigma$}
  \begin{subfigmatrix}{2}
\subfigure[\label{fig:optimalC1122Level8teorA}$C1122/RK4$.]{\includegraphics[width=0.7\textwidth]{fig/isomap_comparisonC1122_teorvsNum_T=1.eps}}
%\subfigure[\label{fig:optimalC1122Level8teorB}]{\includegraphics[width=8.0cm]{Chapter4/fig/isomap_comparisonC1122_teorvsNumoptimal_T=1.eps}}
\subfigure[\label{fig:optimalEpsm5Level8teorA}\textsf{epsm}$5$.]{\includegraphics[width=0.7\textwidth]{fig/isomap_comparisonEpsm5_teorvsNum_T=1.eps}}
%\subfigure[\label{fig:optimalEpsm5Level8teorB}]{\includegraphics[width=8.0cm]{Chapter4/fig/isomap_comparisonEpsm5_teorvsNumoptimal_T=1.eps}}
\end{subfigmatrix}
\caption[Theoretical and numerical contours of the optimal `local' error function as a function of the one-dimensional cost for the monochromatic sinusoidal wave.]{Theoretical (black solid lines) and numerical (black dashed lines) contours of the optimal `local' error function $e(\kappa,\sigma)$ as a function of the one-dimensional cost $c_{1}=1/(\sigma \kappa^2)$ (continuous coloured lines) for the monochromatic sinusoidal wave. (a) $C1122/RK4$; (b) \textsf{epsm}$5$.}
% (a) Tangency condition between iso-contours and iso-error lines.  (b) Optimal cost-error operational points.}
\label{fig:optimal_level8}
\end{figure}%

\begin{table}%[h]
\begin{footnotesize}
 \begin{center}
  \caption{Absolute percentage difference $\Delta c^{*}_{1}$ between the theoretical and numerical cost-optimal operating points as a function of the one-dimensional
cost for the monochromatic sinusoidal wave. The values in parentheses are, respectively, the theoretical and numerical cost-optimal values. The bold values are used in Tab.~\ref{tab:optimal mesured long time}.}
  \label{tab:optimal teo vs num optimal points}
  \begin{tabular}{|l|l|l|l|l|} \hline
 $\tilde{\epsilon}\backslash$scheme & $C1122/RK4$ & $\textsf{epsm}5$           &
$\textsf{epsm}4$                       & $\textsf{epsm}3$             \\ \hline
  $10^{-8}$   &   0.4  (242.14, 243.15)       &  41.19  (4847.74, 6844.58)
    &  47.47  (9480.87, 13982.07)      & 90.98 (16551.3825, 31611.26) \\
  $10^{-7}$   &   1.2  (92.36, 93.52)         &  24.23  (848.37, 1053.96)
    &  152.02 (848.37,  2138.07)       & 45.32 (2931.408,   4260.07)  \\
  $10^{-6}$   &   1.9  (35.40, 36.09)         &  12.53  (147.70, 166.22)
    &  127.51 (147.70,  336.06)        & 19.31 (519.103,    619.37)   \\
  $10^{-5}$   &   2.6  (13.58, \textbf{13.94})&  13.8   (9.51, \textbf{10.83})
    &  473.44 (9.51,    54.57)         &  9.56 (91.632,     100.39)   \\
  $10^{-4}$   &   3.6  (5.21, \textbf{5.40})  &  5.60   (3.41, 3.6)
    &  45.67  (3.4165,  \textbf{4.977})&  6.00 (15.73,      16.68)    \\
  $10^{-3}$   &   4.8  (2.00, \textbf{2.09})  &  4.38   (1.69, 1.77)
    &  14.74  (1.6985,  1.448)         &  6.54 (1.74, \textbf{1.855})    \\
  $10^{-2}$   &   5.7  (0.76, 0.8)            &  7.43   (0.71, 0.76)
    &  2.24   (0.7125,  0.6965)        &  2.12 (0.54,        0.5515)   \\
  $10^{-1}$   &   19.0 (0.25, 0.20)           &  18.05  (0.24, 0.20)
    &  20.89  (0.2465,  0.195)         & 13.06 (0.21,        0.183)    \\ \hline
  \end{tabular}\\
 \end{center}
\end{footnotesize}
\end{table}
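
As a worked check of the tabulated differences, assume that the difference is normalized by the theoretical value (the subscripts `teo' and `num' are labels introduced here for illustration only):
\[
 \Delta c^{*}_{1} = 100\,
 \frac{\left| c^{*}_{1,\mathrm{num}} - c^{*}_{1,\mathrm{teo}} \right|}{c^{*}_{1,\mathrm{teo}}} .
\]
For the $C1122/RK4$ scheme at $\tilde{\epsilon}=10^{-8}$ this gives $100\,\left|243.15-242.14\right|/242.14 \approx 0.4$, and for \textsf{epsm}$5$ at the same error level
\[
 100\,\frac{\left|6844.58-4847.74\right|}{4847.74} \approx 41.19 ,
\]
consistent with the entries of Tab.~\ref{tab:optimal teo vs num optimal points}. Entries recomputed from the rounded values shown in parentheses may differ slightly from the tabulated ones.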


\section{Predicted performance of optimized prefactored schemes in N space dimensions}
\label{sec:n-space_applications}

\begin{figure}
\centering
 \psfrag{x}[B1][B1][1.2][0]{$\kappa$}
 \psfrag{y}[B1][B1][1.2][0]{$\sigma$}
\includegraphics[width=0.7\textwidth,keepaspectratio=true]{fig/approxlocalC12epsm4.eps}
\caption[Iso-level of the normalized `local' error function $e(\kappa, \sigma)$ for the cost-optimized $\textsf{epsm4}$ scheme and of the cost function in two-dimensional space.]{Iso-level of the normalized `local' error function $e(\kappa, \sigma)$ for the cost-optimized $\textsf{epsm4}$ scheme (solid line) and of the cost
function in two-dimensional space (dash-dotted line). `Optimal' (filled circle) and approximate (open circle) working conditions at the design level of
error $\tilde{\epsilon}=10^{-4}$. The dashed and dotted lines represent, respectively, the approximations to the dash-dotted line given by
Eqs.~\ref{eq:error space Lele} and~\ref{eq:error time Hu}.}
\label{fig:approxlocalC12epsm4}
\end{figure}

\section{Conclusion}
\label{sec:conclusion}

%% The Appendices part is started with the command \appendix;
%% appendix sections are then done as normal sections
%% \appendix

%% \section{}
%% \label{}

%% If you have bibdatabase file and want bibtex to generate the
%% bibitems, please use
%%
\bibliographystyle{elsarticle-num}
\bibliography{bib_IS}

%% else use the following coding to input the bibitems directly in the
%% TeX file.

%% \begin{thebibliography}{00}

%% \bibitem{label}
%% Text of bibliographic item

%% \bibitem{}

%% \end{thebibliography}
\end{document}
\endinput
%%
%% End of file `elsarticle-template-num.tex'.
